Curve fitting

Fitting

The purpose of constructing a line of approximation (fitting) is to find the model that best describes your data and to show where new points are most likely to appear.

You have a table of points (X, Y). If you have a model for these numeric data, Y = f(X, a_j) with parameters a_j, j = 1..m, you can find the numeric values of a_j that make the model curve Y = f(X, a_j) most similar to the table-defined curve Y_i(X_i). This process is known as fitting. If you have no model, you first have to find a suitable curve fit.

In this version of the program, a series of points is treated as a time-invariant flow of points (X, Y). The time (sequence number, parameter Z) and the X, Y sizes of a point are not used. The program uses the following methods of approximation (fitting):

Regression line f( X ) or f( Y );
Piece linear f( X ) or f( Y );
Logistic functions f( X ) or f( Y );
Fourier approximation;
Neural network;
Non-linear least-square fitting;
Formula;
Parametric curve;
B-spline curve;
Sections ( X, Y );
User defined function.

The built-in Wizard of Approximation helps you apply a variety of curve fits to your plot.

You can add a Plug-In module (DLL) to include your own non-linear equations in FindGraph. Example C source code for a 'user model' Plug-In is provided in the FindGraph install package, in the subfolder "UserModels". Alternatively, if you are unaccustomed to writing DLLs, we would be happy to produce a plug-in for licensed users at no charge, provided that you can furnish the details of the curve fitting model.

Moreover, you can use your own algorithm as a Plug-In. Simply compile a DLL with any name and place it in the subfolder "APPR". We ship the EXPPOW.DLL sample (pure C); its source is in the subfolder "ApprSource". See the file "ExpPow.cpp" for details.
Contact us for more information.

Weighting scheme

Data points can be given greater or less influence over the fitting process by assigning a weight to each point. You can specify weights Wi for a set of data points (Xi,Yi, i=1..N) for curve fitting. Four different weighting methods are supported by FindGraph:

No weight: Wi = 1.
Instrumental weights: Wi = 1/Ci^2, where Ci are the error bar sizes stored in error bar column Z.
Statistical: Wi = 1/Yi, or Wi = Yi, or Wi = 1/Xi, or Wi = Xi.
Direct: Wi = Ci, where Ci are stored in column Z.
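
The four weighting methods can be sketched as follows. This is a minimal illustration, not FindGraph's code: the function names and the scheme labels are hypothetical, only the 1/Yi variant of the statistical method is shown, and the fit is restricted to a straight line so that the weighted least-squares solution has a closed form.

```python
def make_weights(xs, ys, cs, scheme="none"):
    """Return one weight Wi per point. The scheme labels are hypothetical."""
    if scheme == "none":            # Wi = 1
        return [1.0 for _ in xs]
    if scheme == "instrumental":    # Wi = 1/Ci^2, Ci = error bar size (column Z)
        return [1.0 / (c * c) for c in cs]
    if scheme == "statistical":     # one listed variant: Wi = 1/Yi (Yi must be non-zero)
        return [1.0 / y for y in ys]
    if scheme == "direct":          # Wi = Ci
        return [float(c) for c in cs]
    raise ValueError("unknown weighting scheme: " + scheme)


def weighted_line_fit(xs, ys, ws):
    """Minimize sum Wi*(Yi - a0 - a1*Xi)^2 and return (a0, a1)."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    a1 = sxy / sxx
    return my - a1 * mx, a1


# The last point is an outlier with a large error bar.
xs, ys, cs = [0, 1, 2, 3], [1, 3, 5, 20], [0.1, 0.1, 0.1, 10]
plain = weighted_line_fit(xs, ys, make_weights(xs, ys, cs, "none"))
weighted = weighted_line_fit(xs, ys, make_weights(xs, ys, cs, "instrumental"))
```

With instrumental weights the outlier barely influences the slope, while the unweighted fit is pulled strongly towards it.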

Regression line

The analytical function, chosen from the list below, is built by the method of correlation over the X, Y interval you selected.

Polynomial	f( U ) = a0 + a1U + a2U^2 + ...
Hyperbolic	f( U ) = a0 + a1/U + a2/U^2 + ...
Logarithmic	f( U ) = a0 + a1ln( U ) + a2ln( U )^2 + ...
Power	ln( f )= a0 + a1ln( U ) + a2ln( U )^2 + ...
Exponential	ln( f )= a0 + a1U + a2U^2 + ...

The degree of approximation varies from 0 up to 15. For a given degree, the best approximating function is the one with the minimum error (normal deviation) among the analytical functions in the list.

Regression lines are used to graphically display trends in data and to analyze problems of prediction. Such analysis is also called regression analysis. You can extend a regression line in a chart beyond the actual data to predict future values. For logarithmic, power, and exponential regression lines, FindGraph uses a transformed regression model.
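
A transformed regression model can be illustrated with the first-degree exponential line, ln( f ) = a0 + a1U: the logarithm of the data is fitted with an ordinary straight line, and the result is transformed back. The sketch below is an illustration only; the function names are hypothetical, and it assumes all data values are positive.

```python
import math


def linear_fit(us, vs):
    """Ordinary least-squares straight line v = a0 + a1*u; returns (a0, a1)."""
    n = len(us)
    mu = sum(us) / n
    mv = sum(vs) / n
    suu = sum((u - mu) ** 2 for u in us)
    suv = sum((u - mu) * (v - mv) for u, v in zip(us, vs))
    a1 = suv / suu
    return mv - a1 * mu, a1


def exponential_fit(us, fs):
    """Fit ln(f) = a0 + a1*U by a straight line in the transformed data."""
    return linear_fit(us, [math.log(f) for f in fs])


def predict_exponential(a0, a1, u):
    """Transform back: f = exp(a0 + a1*U). Also usable beyond the data range."""
    return math.exp(a0 + a1 * u)


us = [0.0, 1.0, 2.0, 3.0]
fs = [2.0 * math.exp(0.5 * u) for u in us]
a0, a1 = exponential_fit(us, fs)
forecast = predict_exponential(a0, a1, 4.0)   # extends the line past the data
```

Note that a transformed model minimizes the error in ln( f ), not in f itself, so large values of f carry relatively less influence than in a direct non-linear fit.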

Piece linear

The straight lines are constructed with a constant step in X (or Y) that you select. Each line is built through the mean values of the points in an interval. The number of steps varies from one up to the number of points.
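
One possible reading of this construction is sketched below: the X range is split into equal steps, the mean point of each step is computed, and neighbouring mean points are joined by straight lines. The function name is hypothetical, and the handling of empty intervals (they are skipped) is an assumption.

```python
def piecewise_linear_nodes(xs, ys, steps):
    """Split [min(x), max(x)] into `steps` equal intervals and return the
    (mean x, mean y) of the points in each non-empty interval."""
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / steps
    sums = [[0.0, 0.0, 0] for _ in range(steps)]
    for x, y in zip(xs, ys):
        # The right-most point belongs to the last interval.
        k = min(int((x - lo) / width), steps - 1) if width > 0 else 0
        sums[k][0] += x
        sums[k][1] += y
        sums[k][2] += 1
    return [(sx / n, sy / n) for sx, sy, n in sums if n > 0]


xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 2, 4, 6, 8, 10, 12, 14]
nodes = piecewise_linear_nodes(xs, ys, steps=2)
```

With one step the result is a single mean point; with as many steps as points, the nodes approach the data points themselves.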

Logistic function

An analytical S-shaped (sigmoid) function whose values lie in the range you selected. FindGraph uses a transformed regression model.

	f( U ) = Vmin + (Vmax-Vmin) / (1 + exp(a0 + a1*U))
	log(f( U ) - Vmin) = a0 + a1*exp(-U)

See also the chapter "Non-linear fitting", section "Sigmoidal (Logistic) models".
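
For the first formula, one way to obtain a transformed regression model is to solve for the exponent: ln((Vmax - f) / (f - Vmin)) = a0 + a1*U, which is a straight line in U. The sketch below fits that line by ordinary least squares. It is an illustration under the assumption that Vmin and Vmax are given and every data value lies strictly between them; whether FindGraph uses exactly this transformation is not stated in this chapter, and the function names are hypothetical.

```python
import math


def logistic(u, vmin, vmax, a0, a1):
    """f(U) = Vmin + (Vmax - Vmin) / (1 + exp(a0 + a1*U))."""
    return vmin + (vmax - vmin) / (1.0 + math.exp(a0 + a1 * u))


def fit_logistic(us, fs, vmin, vmax):
    """Fit a0, a1 by a straight line through z = ln((Vmax - f) / (f - Vmin))."""
    zs = [math.log((vmax - f) / (f - vmin)) for f in fs]  # needs vmin < f < vmax
    n = len(us)
    mu = sum(us) / n
    mz = sum(zs) / n
    suu = sum((u - mu) ** 2 for u in us)
    suz = sum((u - mu) * (z - mz) for u, z in zip(us, zs))
    a1 = suz / suu
    return mz - a1 * mu, a1


us = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
fs = [logistic(u, 0.0, 10.0, 2.0, -1.5) for u in us]
a0, a1 = fit_logistic(us, fs, 0.0, 10.0)
```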

Fourier

The Fourier approximation is built by the method of correlation over the X, Y interval you selected.

	f( U ) = a0 +a1cos( U/T ) +b1sin( U/T ) +a2cos(2U/T) +b2sin(2U/T) + ...
	T = (Umax - Umin) / 2 / 3.14159

The degree of approximation varies from 0 up to 18.
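
The coefficients of this series can be estimated by correlating the data with each cosine and sine term. The sketch below assumes the points are equally spaced in U and cover exactly one period (Umin included, Umax excluded), with more than twice as many points as the degree; under these assumptions the correlation sums give the least-squares coefficients directly. For irregularly spaced data a general least-squares solve would be needed instead. The function names are hypothetical.

```python
import math


def fourier_coefficients(us, vs, umin, umax, degree):
    """Return (a, b, T) with a[0] = a0 and a[k], b[k] for k = 1..degree."""
    t = (umax - umin) / (2.0 * math.pi)
    n = len(us)
    a = [sum(vs) / n]
    b = [0.0]                      # b[0] is unused, kept for equal indexing
    for k in range(1, degree + 1):
        a.append(2.0 / n * sum(v * math.cos(k * u / t) for u, v in zip(us, vs)))
        b.append(2.0 / n * sum(v * math.sin(k * u / t) for u, v in zip(us, vs)))
    return a, b, t


def fourier_value(u, a, b, t):
    """Evaluate f(U) = a0 + sum of a_k*cos(k*U/T) + b_k*sin(k*U/T)."""
    return a[0] + sum(
        a[k] * math.cos(k * u / t) + b[k] * math.sin(k * u / t)
        for k in range(1, len(a))
    )


umin, umax, n = 0.0, 4.0, 16
t_true = (umax - umin) / (2.0 * math.pi)
us = [umin + (umax - umin) * i / n for i in range(n)]
vs = [1.0 + 2.0 * math.cos(u / t_true) - 0.5 * math.sin(2 * u / t_true) for u in us]
a, b, t = fourier_coefficients(us, vs, umin, umax, degree=2)
```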

Neural network

Neural networks are based on a crude low-level model of biological neural systems. We use them only for simple non-linear approximation. All points in the X, Y interval you selected are used for training the network. A neural network with one hidden layer is used. You can vary the number of neurons from 2 to 20 and select the neuron activation function:

	f( U ) = (U - dU) / (abs(U - dU) + a)
	f( U ) = 1. / (1. + exp(-a*abs(U - dU)))
	Parameter 'a' must be within the limits from 0 to 5.

Because our goal is to construct a line of approximation quickly, the number of iterations of the training process is limited.
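
The structure of such a network can be sketched as a forward pass with the first activation function above. The layout of the weights (one input weight and one shift dU per hidden neuron, one output weight per neuron, one output bias) is an assumption made for illustration, as are the function names; training is not shown.

```python
def activation(u, du, a):
    """f(U) = (U - dU) / (abs(U - dU) + a); bounded in (-1, 1) for a > 0."""
    return (u - du) / (abs(u - du) + a)


def network_output(x, hidden, out_weights, out_bias, a=1.0):
    """One hidden layer: `hidden` holds an (input weight, shift dU) pair per neuron."""
    total = out_bias
    for (w_in, du), w_out in zip(hidden, out_weights):
        total += w_out * activation(w_in * x, du, a)
    return total


hidden = [(1.0, 0.0), (2.0, 0.5)]       # two neurons, hypothetical values
y = network_output(0.3, hidden, [0.5, -0.5], 0.25)
```

Training would adjust the weights, shifts and bias so that the network output matches the Y values of the selected points.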

Non-linear fitting

FindGraph supplies a library of over 120 industry-specific formulas. The simplex and gradient algorithms are used for fast nonlinear regression. The Wizard of Approximation helps you apply a variety of curve fits to your plot. You can use one of the predefined fits and vary the number of parameters from 1 up to 8.

Formula

You can enter your own equation to fit your data and vary the number of parameters from 1 up to 4. FindGraph uses one of the non-linear least-squares fitting algorithms, namely the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm.

Parametric

FindGraph uses parametric graphs X(u), Y(u) to fit data with curves.

X(U) = X0 + A1*sin(2*pi*U) + ... + An*cos(2*pi*n*U)

Y(U) = Y0 + B1*sin(2*pi*U) + ... + Dn*cos(2*pi*n*U)

Parameter 'U' varies from 0 to 1.

FindGraph finds the curve that approximates the data polygon in the least-squares sense.

The number of harmonics M varies from 1 up to 60.
A least squares curve approximation for closed curves is possible.
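
For a closed curve, a harmonic fit of this kind can be sketched as follows. The example assumes the data points are spaced uniformly in the parameter, U_i = i/N, so that each coordinate can be fitted separately by simple sums; real data would generally need a parameterization (for instance by distance along the data polygon) and a general least-squares solve. The function names and the sine/cosine coefficient lists are hypothetical.

```python
import math


def fit_harmonics(values, m):
    """Fit one coordinate with m harmonics; returns (mean, sin coeffs, cos coeffs)."""
    n = len(values)
    mean = sum(values) / n
    sin_c, cos_c = [], []
    for k in range(1, m + 1):
        sin_c.append(2.0 / n * sum(
            v * math.sin(2 * math.pi * k * i / n) for i, v in enumerate(values)))
        cos_c.append(2.0 / n * sum(
            v * math.cos(2 * math.pi * k * i / n) for i, v in enumerate(values)))
    return mean, sin_c, cos_c


def eval_harmonics(u, mean, sin_c, cos_c):
    """Evaluate one coordinate at parameter U in [0, 1]."""
    return mean + sum(
        s * math.sin(2 * math.pi * (k + 1) * u) + c * math.cos(2 * math.pi * (k + 1) * u)
        for k, (s, c) in enumerate(zip(sin_c, cos_c)))


# Points on an ellipse, used here as a closed test curve.
n = 12
xs = [3.0 + 2.0 * math.cos(2 * math.pi * i / n) for i in range(n)]
ys = [1.0 + 0.5 * math.sin(2 * math.pi * i / n) for i in range(n)]
x_fit = fit_harmonics(xs, m=2)
y_fit = fit_harmonics(ys, m=2)
```

Because both X(U) and Y(U) are periodic in U, the fitted curve closes on itself automatically.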

B-splines

Given a set of N+1 data points, a degree p, and a number M, where N > M >= p >= 1, FindGraph finds a B-spline curve of degree p, defined by M+1 control points, that passes through the first and last data points and satisfies a least-squares criterion.

The number of control points M varies from 1 up to 60.
The degree p varies from 1 up to 6.
If the number of data points N is less than 60, interpolation is possible. In interpolation, the curve passes through all given data points in the given order.
Interpolation and least-squares approximation are also possible for closed curves.
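
The sketch below shows how a B-spline curve of degree p is evaluated from its control points with the standard Cox-de Boor recursion. It does not perform the least-squares fit itself (which solves a linear system for the control points); it only illustrates the curve model. The clamped, uniformly spaced knot vector is an assumption, and the function names are hypothetical.

```python
def clamped_knots(num_ctrl, p):
    """Clamped knot vector on [0, 1] with uniformly spaced interior knots."""
    inner = num_ctrl - p - 1
    return [0.0] * (p + 1) + [(j + 1) / (inner + 1) for j in range(inner)] + [1.0] * (p + 1)


def basis(i, p, u, knots):
    """Cox-de Boor recursion for the basis function N_{i,p}(u)."""
    if p == 0:
        if knots[i] <= u < knots[i + 1]:
            return 1.0
        # Close the last non-empty span so that u = 1 is covered.
        if u == knots[-1] and knots[i] < knots[i + 1] == knots[-1]:
            return 1.0
        return 0.0
    left = right = 0.0
    d1 = knots[i + p] - knots[i]
    if d1 > 0:
        left = (u - knots[i]) / d1 * basis(i, p - 1, u, knots)
    d2 = knots[i + p + 1] - knots[i + 1]
    if d2 > 0:
        right = (knots[i + p + 1] - u) / d2 * basis(i + 1, p - 1, u, knots)
    return left + right


def curve_point(ctrl, p, u, knots):
    """Point on the curve: sum of N_{i,p}(u) times control point i."""
    ws = [basis(i, p, u, knots) for i in range(len(ctrl))]
    return (sum(w * c[0] for w, c in zip(ws, ctrl)),
            sum(w * c[1] for w, c in zip(ws, ctrl)))


ctrl = [(0.0, 0.0), (1.0, 2.0), (2.0, -1.0), (3.0, 2.0), (4.0, 0.0)]
knots = clamped_knots(len(ctrl), 2)
start = curve_point(ctrl, 2, 0.0, knots)
end = curve_point(ctrl, 2, 1.0, knots)
```

With a clamped knot vector the curve starts at the first control point and ends at the last one, which is how a fitted curve can be made to pass through the first and last data points.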

Sections

The straight-line sections are built with an algorithm from the theory of pattern recognition. This version uses the "dot transformation" method (Hough transform). The method is based on geometrical matching of input groups of points against standards; straight-line sections are used as the standards. The parameter used, the number of grid clusters N, varies from 2 up to 100.
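
The idea of the Hough transform can be sketched as follows: every point votes for all lines rho = x*cos(theta) + y*sin(theta) that pass through it, and the grid cell with the most votes identifies a line shared by many points. This is a generic illustration, not FindGraph's implementation; the grid sizes n_theta and n_rho and the function name are hypothetical.

```python
import math


def hough_best_line(points, n_theta=180, n_rho=100):
    """Vote in a (theta, rho) grid; return (votes, theta, rho) of the best cell."""
    rho_max = max(math.hypot(x, y) for x, y in points) or 1.0
    acc = {}
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            # Map rho from [-rho_max, rho_max] to a cell index 0..n_rho-1.
            r = int(round((rho + rho_max) / (2 * rho_max) * (n_rho - 1)))
            acc[(t, r)] = acc.get((t, r), 0) + 1
    (t, r), votes = max(acc.items(), key=lambda kv: kv[1])
    theta = math.pi * t / n_theta
    rho = r / (n_rho - 1) * 2 * rho_max - rho_max   # centre of the winning cell
    return votes, theta, rho


# Five points on the line y = 2 plus one point off the line.
points = [(0, 2), (1, 2), (2, 2), (3, 2), (4, 2), (3, -1)]
votes, theta, rho = hough_best_line(points)
```

The result is accurate only to the size of a grid cell, so a coarser grid groups more points into one line and a finer grid separates nearby lines.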

User defined functions

Several Plug-Ins are included in the FindGraph distribution:

f(U) = V0 + Exp(a*(U-U0))*Pow((U-U0),b). Source code is included.

V(U) = V0 + (U-U0)^M * (A0 + A1*U + ... + AN*U^N)
	Parameters:
	M = power of (U-U0), in the limits from -5 to +5
	N = degree of the polynomial approximation, in the limits from 0 to +5

Lines on Step: V(U, i) = Vi0 + Ai*(1 - exp(-(U-Ui0)/Bi)), with Ai, Bi for each step.
	Parameters:
	Step value, default 2. The number of steps is (Umax - Umin) / Step.
	Enhanced: if 1, nonlinear fitting model; if 0, simple logistic model, i.e. Yi0 and Ai are fixed.
	Fixed x0: if 1, x0i is fixed at x0i = min(x) in the interval.

f(U) = A0 + A1*U + ... + AN*U^N
	Finds the line that minimizes the perpendicular distance between the line and the points. This is the so-called 'Deming regression'. It fits a polynomial line assuming equal experimental errors in both U and V. In contrast, ordinary regression assumes that the U values are known precisely and that all the experimental error is in V. The calculations require that U and V are in the same units.
	Parameters:
	Fixed point: if 1, the point (U0, V0) is fixed on the resulting line.
	N = degree of the polynomial approximation, in the limits from 0 to +4
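
For the straight-line case (N = 1), the perpendicular-distance fit with equal errors in U and V has a well-known closed form, sketched below. This is an illustration only; the plug-in also handles higher polynomial degrees and the fixed-point option, which are not covered here, and the function name is hypothetical.

```python
import math


def orthogonal_line_fit(us, vs):
    """Straight line V = a0 + a1*U minimizing perpendicular distances.

    Deming regression with equal error variances in U and V.
    """
    n = len(us)
    mu = sum(us) / n
    mv = sum(vs) / n
    suu = sum((u - mu) ** 2 for u in us)
    svv = sum((v - mv) ** 2 for v in vs)
    suv = sum((u - mu) * (v - mv) for u, v in zip(us, vs))
    if suv == 0:
        raise ValueError("U and V are uncorrelated; the slope is not defined")
    a1 = (svv - suu + math.sqrt((svv - suu) ** 2 + 4 * suv ** 2)) / (2 * suv)
    return mv - a1 * mu, a1


us = [0.0, 1.0, 2.0, 3.0]
vs = [0.0, 1.2, 1.8, 3.1]
a0, a1 = orthogonal_line_fit(us, vs)
b0, b1 = orthogonal_line_fit(vs, us)   # roles of U and V exchanged
```

Unlike ordinary regression, this fit treats U and V symmetrically: exchanging the two columns gives the same line, so the second slope is the reciprocal of the first.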

You can use your own algorithm as a Plug-In. To do so, prepare a DLL and copy it to the subdirectory 'Appr'. An example DLL with source code (pure C) is in the subdirectory 'ApprSource'.

Our developers do custom work for businesses and organizations who contact us with a need to expand the functionality of our applications.

Best fit function

The built-in Wizard helps you find the best curve fit. The best function is the one with the minimum standard error of the estimate. It is selected from a list of polynomial and Fourier functions and from a library of industry-specific nonlinear functions.

To start the Wizard, select the menu item <Fit><Best Function>.
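
Selecting a fit by the minimum standard error of the estimate can be sketched as follows. The sketch uses the usual definition, the square root of the residual sum of squares divided by the number of points minus the number of fitted parameters; whether FindGraph uses exactly this normalization is an assumption, and the function names and candidate labels are hypothetical.

```python
import math


def standard_error(ys, fitted, n_params):
    """sqrt(sum of squared residuals / (number of points - number of parameters))."""
    ssr = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    return math.sqrt(ssr / (len(ys) - n_params))


def best_model(ys, candidates):
    """candidates maps a name to (fitted values, number of parameters)."""
    return min(candidates, key=lambda name: standard_error(ys, *candidates[name]))


ys = [1.0, 3.0, 5.0, 7.0]
candidates = {
    "constant": ([4.0, 4.0, 4.0, 4.0], 1),   # mean of the data
    "line": ([1.0, 3.0, 5.0, 7.0], 2),       # exact straight-line fit
}
winner = best_model(ys, candidates)
```

Dividing by the number of points minus the number of parameters penalizes models that reach a small residual only by using many parameters.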

Fitting Log Window

FindGraph saves all information about data fitting and transformations in the Fitting Log window. To view it, select the menu item <View><Fitting Log>. You can add your own notes and save all the information as an HTML file.

See examples.

See Linear Regression, Interpolation.
